Credal Classification for Mining Environmental Data

نویسنده

M. Zaffalon

چکیده

Classifiers that aim at doing reliable predictions should rely on carefully elicited prior knowledge. Often this is not available so they should be able to start learning from data in condition of prior ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Common ways to represent prior ignorance are shown to have an overwhelming weight compared to the information in the data, producing overconfident predictions. This point is crucial for problems, such as environmental ones, where prior knowledge is often scarce and even the data may not be known precisely. Credal classification, and in particular the naive credal classifier, are proposed as more faithful ways to represent states of ignorance. We show that with credal classification, conditions of ignorance may limit the power of the inferences, not the reliability of the predictions.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Tree-Based Credal Networks for Classification

Bayesian networks are models for uncertain reasoning which are achieving a growing importance also for the data mining task of classification. Credal networks extend Bayesian nets to sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classification, called tree-augmented naive Bayes classifier, to credal sets originated from probability intervals. This...

متن کامل

Credible classification for environmental problems

Classifiers that aim at doing credible predictions should rely on carefully elicited prior knowledge. Often this is not available so they should start learning from data in condition of near-ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Traditional ways to represent prior ignorance are sho...

متن کامل

Comments on "Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks" by Andrés R. Masegosa and Serafín Moral

We briefly overview the problem of learning probabilities from data using imprecise probability models that express very weak prior beliefs. Then we comment on the new contributions to this question given in the paper by Masegosa and Moral and provide some insights about the performance of their models in data mining experiments of classification.

متن کامل

JNCC2: An extension of naive Bayes classifier suited for small and incomplete data sets

JNCC2 implements the Naive Credal Classifier 2 (NCC2), i.e., an extension of naive Bayes to imprecise probabilities, designed to return robust classification even on small and/or incomplete data sets, which is often the case in environmental case studies.

متن کامل

On Mining Fuzzy Classification Rules for Imbalanced Data

Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it on imbalanced data sets is its biased to the majority class, such that, it performs poorly in respect to the minority class. However many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2002

Credal Classification for Mining Environmental Data

نویسنده

چکیده

منابع مشابه

Tree-Based Credal Networks for Classification

Credible classification for environmental problems

Comments on "Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks" by Andrés R. Masegosa and Serafín Moral

JNCC2: An extension of naive Bayes classifier suited for small and incomplete data sets

On Mining Fuzzy Classification Rules for Imbalanced Data

عنوان ژورنال:

اشتراک گذاری